Speech Recognition


Speech recognition is the task of identifying the words in spoken audio by analyzing the voice and the language, and transcribing them accurately as text.

Diffusion Language Models for Speech Recognition

Apr 15, 2026

Pushing the Limits of On-Device Streaming ASR: A Compact, High-Accuracy English Model for Low-Latency Inference

Apr 16, 2026

Interactive ASR: Towards Human-Like Interaction and Semantic Coherence Evaluation for Agentic Speech Recognition

Apr 14, 2026

Speaker Attributed Automatic Speech Recognition Using Speech Aware LLMs

Apr 13, 2026

Teaching the Teachers: Boosting unsupervised domain adaptation in speech recognition by ensemble update

Apr 13, 2026

BlasBench: An Open Benchmark for Irish Speech Recognition

Apr 12, 2026

Empowering Video Translation using Multimodal Large Language Models

Apr 13, 2026

When Does Data Augmentation Help? Evaluating LLM and Back-Translation Methods for Hausa and Fongbe NLP

Apr 14, 2026

Cross-Cultural Bias in Mel-Scale Representations: Evidence and Alternatives from Speech and Music

Apr 12, 2026

SEDTalker: Emotion-Aware 3D Facial Animation Using Frame-Level Speech Emotion Diarization

Apr 14, 2026